我们的目标是使随机梯度$ \ sigma^2 $在随机梯度和(ii)问题依赖性常数中自适应(i)自适应。当最大程度地减少条件编号$ \ kappa $的平滑,强大的功能时,我们证明,$ t $ t $ toerations sgd的$ t $ toerations sgd具有指数降低的阶跃尺寸和对平滑度的知识可以实现$ \ tilde {o} \ left(\ exp) \ left(\ frac {-t} {\ kappa} \ right) + \ frac {\ sigma^2} {t} \ right)$ rate,而又不知道$ \ sigma^2 $。为了适应平滑度,我们使用随机线路搜索(SLS)并显示(通过上下距离),其SGD的SGD与SLS以所需的速率收敛,但仅针对溶液的邻域。另一方面,我们证明具有平滑度的离线估计值的SGD会收敛到最小化器。但是,其速率与估计误差成正比的速度减慢。接下来,我们证明具有Nesterov加速度和指数步骤尺寸(称为ASGD)的SGD可以实现接近最佳的$ \ tilde {o} \ left(\ exp \ left(\ frac {-t} {-t} {\ sqrt {\ sqrt {\ sqrt { \ kappa}}} \ right) + \ frac {\ sigma^2} {t} \ right)$ rate,而无需$ \ sigma^2 $。当与平滑度和强频率的离线估计值一起使用时,ASGD仍会收敛到溶液,尽管速度较慢。我们从经验上证明了指数级尺寸的有效性以及新型SLS的变体。
translated by 谷歌翻译
有限和最小化的方差减少(VR)方法通常需要对往复且难以估计的问题依赖性常数的知识。为了解决这个问题,我们使用自适应梯度方法的想法来提出ADASVRG,这是SVRG的更强大变体,即常见的VR方法。 ADASVRG在SVRG的内循环中使用Adagrad,使其稳健地选择阶梯大小。当最小化N平滑凸函数的总和时,我们证明了ADASVRG的变体需要$ \ TINDE {O}(N + 1 / ePSILON)$梯度评估,以实现$ O(\ epsilon)$ - 次优,匹配典型速率,但不需要知道问题依赖性常数。接下来,我们利用Adagrad的属性提出了一种启发式,可以自适应地确定ADASVRG中的每个内循环的长度。通过对合成和现实世界数据集的实验,我们验证了ADASVRG的稳健性和有效性,证明了其对标准和其他“无调谐”VR方法的卓越性能。
translated by 谷歌翻译
The recent increase in public and academic interest in preserving biodiversity has led to the growth of the field of conservation technology. This field involves designing and constructing tools that utilize technology to aid in the conservation of wildlife. In this article, we will use case studies to demonstrate the importance of designing conservation tools with human-wildlife interaction in mind and provide a framework for creating successful tools. These case studies include a range of complexities, from simple cat collars to machine learning and game theory methodologies. Our goal is to introduce and inform current and future researchers in the field of conservation technology and provide references for educating the next generation of conservation technologists. Conservation technology not only has the potential to benefit biodiversity but also has broader impacts on fields such as sustainability and environmental protection. By using innovative technologies to address conservation challenges, we can find more effective and efficient solutions to protect and preserve our planet's resources.
translated by 谷歌翻译
We address the problem of extracting key steps from unlabeled procedural videos, motivated by the potential of Augmented Reality (AR) headsets to revolutionize job training and performance. We decompose the problem into two steps: representation learning and key steps extraction. We employ self-supervised representation learning via a training strategy that adapts off-the-shelf video features using a temporal module. Training implements self-supervised learning losses involving multiple cues such as appearance, motion and pose trajectories extracted from videos to learn generalizable representations. Our method extracts key steps via a tunable algorithm that clusters the representations extracted from procedural videos. We quantitatively evaluate our approach with key step localization and also demonstrate the effectiveness of the extracted representations on related downstream tasks like phase classification. Qualitative results demonstrate that the extracted key steps are meaningful to succinctly represent the procedural tasks.
translated by 谷歌翻译
We introduce Argoverse 2 (AV2) - a collection of three datasets for perception and forecasting research in the self-driving domain. The annotated Sensor Dataset contains 1,000 sequences of multimodal data, encompassing high-resolution imagery from seven ring cameras, and two stereo cameras in addition to lidar point clouds, and 6-DOF map-aligned pose. Sequences contain 3D cuboid annotations for 26 object categories, all of which are sufficiently-sampled to support training and evaluation of 3D perception models. The Lidar Dataset contains 20,000 sequences of unlabeled lidar point clouds and map-aligned pose. This dataset is the largest ever collection of lidar sensor data and supports self-supervised learning and the emerging task of point cloud forecasting. Finally, the Motion Forecasting Dataset contains 250,000 scenarios mined for interesting and challenging interactions between the autonomous vehicle and other actors in each local scene. Models are tasked with the prediction of future motion for "scored actors" in each scenario and are provided with track histories that capture object location, heading, velocity, and category. In all three datasets, each scenario contains its own HD Map with 3D lane and crosswalk geometry - sourced from data captured in six distinct cities. We believe these datasets will support new and existing machine learning research problems in ways that existing datasets do not. All datasets are released under the CC BY-NC-SA 4.0 license.
translated by 谷歌翻译
In training neural networks, batch normalization has many benefits, not all of them entirely understood. But it also has some drawbacks. Foremost is arguably memory consumption, as computing the batch statistics requires all instances within the batch to be processed simultaneously, whereas without batch normalization it would be possible to process them one by one while accumulating the weight gradients. Another drawback is that that distribution parameters (mean and standard deviation) are unlike all other model parameters in that they are not trained using gradient descent but require special treatment, complicating implementation. In this paper, I show a simple and straightforward way to address these issues. The idea, in short, is to add terms to the loss that, for each activation, cause the minimization of the negative log likelihood of a Gaussian distribution that is used to normalize the activation. Among other benefits, this will hopefully contribute to the democratization of AI research by means of lowering the hardware requirements for training larger models.
translated by 谷歌翻译
In this paper, we introduce neural texture learning for 6D object pose estimation from synthetic data and a few unlabelled real images. Our major contribution is a novel learning scheme which removes the drawbacks of previous works, namely the strong dependency on co-modalities or additional refinement. These have been previously necessary to provide training signals for convergence. We formulate such a scheme as two sub-optimisation problems on texture learning and pose learning. We separately learn to predict realistic texture of objects from real image collections and learn pose estimation from pixel-perfect synthetic data. Combining these two capabilities allows then to synthesise photorealistic novel views to supervise the pose estimator with accurate geometry. To alleviate pose noise and segmentation imperfection present during the texture learning phase, we propose a surfel-based adversarial training loss together with texture regularisation from synthetic data. We demonstrate that the proposed approach significantly outperforms the recent state-of-the-art methods without ground-truth pose annotations and demonstrates substantial generalisation improvements towards unseen scenes. Remarkably, our scheme improves the adopted pose estimators substantially even when initialised with much inferior performance.
translated by 谷歌翻译
Prevailing methods for assessing and comparing generative AIs incentivize responses that serve a hypothetical representative individual. Evaluating models in these terms presumes homogeneous preferences across the population and engenders selection of agglomerative AIs, which fail to represent the diverse range of interests across individuals. We propose an alternative evaluation method that instead prioritizes inclusive AIs, which provably retain the requisite knowledge not only for subsequent response customization to particular segments of the population but also for utility-maximizing decisions.
translated by 谷歌翻译
We designed and constructed an A-sized base autonomous underwater vehicle (AUV), augmented with a stack of modular and extendable hardware and software, including autonomy, navigation, control and high fidelity simulation capabilities (A-size stands for the standard sonobuoy form factor, with a maximum diameter of 124 mm). Subsequently, we extended this base vehicle with a novel tuna-inspired morphing fin payload module (referred to as the Morpheus AUV), to achieve good directional stability and exceptional maneuverability; properties that are highly desirable for rigid hull AUVs, but are presently difficult to achieve because they impose contradictory requirements. The morphing fin payload allows the base AUV to dynamically change its stability-maneuverability qualities by using morphing fins, which can be deployed, deflected and retracted, as needed. The base vehicle and Morpheus AUV were both extensively field tested in-water in the Charles river, Massachusetts, USA; by conducting hundreds of hours of operations over a period of two years. The maneuvering capability of the Morpheus AUV was evaluated with and without the use of morphing fins to quantify the performance improvement. The Morpheus AUV was able to showcase an exceptional turning rate of around 25-35 deg/s. A maximum turn rate improvement of around 35% - 50% was gained through the use of morphing fins.
translated by 谷歌翻译
Imitation learning (IL) is a simple and powerful way to use high-quality human driving data, which can be collected at scale, to identify driving preferences and produce human-like behavior. However, policies based on imitation learning alone often fail to sufficiently account for safety and reliability concerns. In this paper, we show how imitation learning combined with reinforcement learning using simple rewards can substantially improve the safety and reliability of driving policies over those learned from imitation alone. In particular, we use a combination of imitation and reinforcement learning to train a policy on over 100k miles of urban driving data, and measure its effectiveness in test scenarios grouped by different levels of collision risk. To our knowledge, this is the first application of a combined imitation and reinforcement learning approach in autonomous driving that utilizes large amounts of real-world human driving data.
translated by 谷歌翻译